Abstract. We show that any loop-free Markov chain on a discrete space can be viewed as a determinantal point
process. As an application, we prove central limit theorems for the number of particles in a window for renewal
processes and Markov renewal processes with Bernoulli noise.
Introduction

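Let $\mathfrak{X}$ be a discrete space. A (simple) random point process $\mathcal{P}$ on $\mathfrak{X}$ is a probability measure on the set $2^{\mathfrak{X}}$
of all subsets of $\mathfrak{X}$. $\mathcal{P}$ is called determinantal if there exists a $|\mathfrak{X}| \times |\mathfrak{X}|$ matrix $K$ with rows and columns
marked by elements of $\mathfrak{X}$, such that for any finite $Y = (y_1, \dots, y_n) \subset \mathfrak{X}$ one has

\[
\mathcal{P}\{X \in 2^{\mathfrak{X}} \mid Y \subset X\} = \det[K_{y_i y_j}]_{i,j=1}^{n}.
\]

The matrix $K$ is called a correlation kernel for $\mathcal{P}$.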

A similar definition can be given for $\mathfrak{X}$ being any reasonable space; then the measure lives on locally finite
subsets of $\mathfrak{X}$.

Determinantal point processes (with $\mathfrak{X} = \mathbb{R}$) have been used in random matrix theory since the early
60s. As a separate class, determinantal processes were first singled out in the mid-70s in [9], where the term
fermion point processes was used. The term "determinantal" was introduced at the end of the 90s in [2],
and it is now widely accepted. We refer the reader to surveys [1, 8, 12, 15] for further references and details.

Determinantal point processes have been enormously effective for studying scaling limits of interacting 
particle systems arising in a variety of domains of mathematics and mathematical physics including random 
matrices, representation theory, combinatorics, random growth models, etc. However, these processes are
still considered "exotic": the common belief is that one needs a very special probabilistic model to
observe a determinantal process.

The main goal of this note is to show that determinantal point processes are much more common. More 
exactly, we show that for any loop-free Markov chain the induced measure on trajectories is a determinantal 
point process. (Note that the absence of loops is essential - otherwise trajectories cannot be viewed as 
subsets of the phase space.) We work in a discrete state space in order to avoid technical difficulties like 
trajectories that may have almost closed loops, but our construction easily extends to suitable classes of 
loop-free Markov chains on continuous spaces as well. 

Surprisingly, very little is known about arbitrary determinantal point processes. However, in the case 
when the correlation kernel is self-adjoint, several general results are available, and even more is known 
when the kernel is both self-adjoint and translation invariant (with $\mathfrak{X} = \mathbb{Z}^d$ or $\mathbb{R}^d$). See [1, 8, 12] and [15] for
details. 

In our situation, the kernel is usually not self-adjoint, and self-adjoint examples should be viewed as
"exotic." One such example goes back to [9], see also [12], Section 2.4: it is a 2-parameter family of renewal
processes, that is, processes on $\mathbb{Z}$ or $\mathbb{R}$ with positive i.i.d. increments. Our result implies that if we do not insist
on self-adjointness then any process with positive i.i.d. increments is determinantal.

As an application, we prove a central limit theorem for the number of points in a growing window 
for renewal processes with Bernoulli noise and for Markov renewal processes (also known as semi-Markov 
processes) with Bernoulli noise. The proof is a version of the argument from [5, 13] adapted to non-self-adjoint 
kernels. 

The key property that allows us to prove the central limit theorem is the boundedness of the operator
defined by the correlation kernel in $\ell^2(\mathfrak{X})$. This boundedness is a corollary of what is known as a "renewal
theorem" with a controlled rate of convergence.

1. Markov chains and determinantal point processes 

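Let $\mathfrak{X}$ be a discrete space, and let $P = [P_{xy}]_{x,y \in \mathfrak{X}}$ be the matrix of transition probabilities for a discrete time
Markov chain on $\mathfrak{X}$. That is, $P_{xy} \ge 0$ for all $x, y \in \mathfrak{X}$ and

\[
\sum_{y \in \mathfrak{X}} P_{xy} = 1 \quad \text{for any } x \in \mathfrak{X}.
\]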

Let us assume that our Markov chain is loop-free, i.e., the trajectories of the Markov chain do not pass 
through the same point twice almost surely. In other words, we assume that 

\[
(P^k)_{xx} = 0 \quad \text{for any } k > 0 \text{ and } x \in \mathfrak{X}.
\]

This condition guarantees the finiteness of the matrix elements of the matrix

\[
Q = P + P^2 + P^3 + \cdots.
\]

Indeed, $(P^k)_{xy}$ is the probability that the trajectory started at $x$ is at $y$ after the $k$th step. Hence, $Q_{xy}$ is the
probability that the trajectory started at $x$ passes through $y \ne x$, and since there are no loops, we have
$Q_{xy} \le 1$. Clearly, $Q_{xx} = 0$.

Theorem 1.1. For any probability measure $\pi = [\pi_x]_{x \in \mathfrak{X}}$ on $\mathfrak{X}$, consider the Markov chain with initial
distribution $\pi$ and transition matrix $P$ as a probability measure on trajectories viewed as subsets of $\mathfrak{X}$. Then
this measure on $2^{\mathfrak{X}}$ is a determinantal point process on $\mathfrak{X}$ with correlation kernel

\[
K_{xy} = \pi_x + (\pi Q)_x - Q_{yx}.
\]

In fact, this kernel can be written as the sum of a nilpotent matrix and a matrix of rank 1.
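The statement of Theorem 1.1 can be checked numerically on a small example. The sketch below is not part of the original argument and rests on assumptions chosen for illustration: the states $0, \dots, n-1$ form a finite window of a larger loop-free chain, $P$ is strictly upper triangular with row sums below 1, and the missing mass in each row (and in $\pi$) is interpreted as the trajectory leaving (or never entering) the window for good. The helper names `correlation_kernel`, `trajectory_probability`, `max_discrepancy` and `example_chain` are hypothetical. The code compares the correlation functions obtained by direct enumeration of trajectories with the minors of $K_{xy} = \pi_x + (\pi Q)_x - Q_{yx}$.

```python
import itertools
import numpy as np


def correlation_kernel(pi, P):
    """K[x, y] = pi_x + (pi Q)_x - Q_yx with Q = P + P^2 + ... ."""
    n = len(pi)
    identity = np.eye(n)
    # P is strictly upper triangular (assumption), hence nilpotent,
    # and the series sums to Q = (1 - P)^{-1} - 1.
    Q = np.linalg.inv(identity - P) - identity
    visit = pi + pi @ Q  # probability that the trajectory passes through x
    return visit[:, None] - Q.T


def trajectory_probability(S, pi, P):
    """Probability that the set of visited window states is exactly S (sorted tuple)."""
    if not S:
        return 1.0 - pi.sum()
    leave = 1.0 - P.sum(axis=1)  # probability of leaving the window for good
    prob = pi[S[0]]
    for a, b in zip(S, S[1:]):
        prob *= P[a, b]
    return prob * leave[S[-1]]


def max_discrepancy(pi, P):
    """Largest gap between Prob{Y is contained in the trajectory} and det K restricted to Y."""
    n = len(pi)
    K = correlation_kernel(pi, P)
    subsets = [S for r in range(n + 1) for S in itertools.combinations(range(n), r)]
    worst = 0.0
    for Y in subsets[1:]:  # skip the empty set
        rho = sum(trajectory_probability(S, pi, P)
                  for S in subsets if set(Y) <= set(S))
        minor = np.linalg.det(K[np.ix_(Y, Y)])
        worst = max(worst, abs(rho - minor))
    return worst


def example_chain(n=5, seed=0):
    """A random window chain: strictly upper triangular P, substochastic rows."""
    rng = np.random.default_rng(seed)
    P = np.triu(rng.random((n, n)), k=1)
    P /= P.sum(axis=1, keepdims=True) + 1.0  # row sums strictly below 1
    pi = rng.random(n)
    pi /= pi.sum() + 1.0                     # total mass strictly below 1
    return pi, P


pi, P = example_chain()
print(max_discrepancy(pi, P))  # expected to be at the level of rounding error
```

The enumeration is exponential in the window size, so this is only meant as a sanity check for a handful of states.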
Before proceeding to the proof, let us point out that for any Markov chain $\{X(n)\}$ on a discrete state
space $\mathfrak{X}$, its graph $\{(n, X(n)),\ n = 1, 2, \dots\}$ is a loop-free Markov chain on $\mathbb{Z}_{>0} \times \mathfrak{X}$. Hence, the graph
defines a determinantal point process.

Also, there exists a class of Markov chains $\{X(n)\}$ such that for each $n$, $X(n)$ is a random point configuration
in some space $\mathfrak{X}_0$, and the graph $\{(n, X(n))\}$ defines a determinantal point process on $\mathbb{Z}_{>0} \times \mathfrak{X}_0$, see
e.g. [3, 7, 16]. (Here $\mathfrak{X}$ is a suitable space of point configurations in $\mathfrak{X}_0$.) In such a case, the graph carries
two types of determinantal structures; the second one, on $\mathbb{Z}_{>0} \times \mathfrak{X}$, is afforded by Theorem 1.1.

The author is grateful to one of the referees for these remarks. 

The proof of Theorem 1.1 is based on the following simple lemma, cf. [12], proof of Theorem 6. 



Lemma 1.2. 
Proof. The determinant in question is an $n$th degree polynomial in the $a_i$'s with the highest coefficient
$a_1 \cdots a_n b_1 \cdots b_n$, which vanishes at $a_1 = 0$, $a_2 = c_1/b_1$, \dots, $a_n = c_{n-1}/b_{n-1}$. This implies the statement
for $b_1 \cdots b_n \ne 0$, and thus, by continuity, for all values of the parameters. □

Proof of Theorem 1.1. Let us evaluate the correlation function $\rho_n(x_1, \dots, x_n)$ using the Markov chain.
First, let us reorder the points $x_1, \dots, x_n$ in such a way that $Q_{x_i x_j} = 0$ for $i > j$. This is always possible
because if $Q_{xy} > 0$ then $Q_{yx} = 0$, and if $Q_{xy} > 0$ and $Q_{yz} > 0$ then $Q_{xz} > 0$, so we are simply listing the
elements of a finite partially ordered set in a nondecreasing order. (For example, we can first list all minimal
elements, then all elements that are minimal in the remaining set, etc.)
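The reordering step can be sketched in code. The following is an illustration only, under the assumption that the matrix $Q$ is available as a NumPy array for a finite loop-free chain; the names `hitting_matrix` and `order_points` are hypothetical. It follows the recipe in the parenthetical remark: list all minimal elements, then all elements that are minimal among those that remain, and so on.

```python
import numpy as np


def hitting_matrix(P):
    """Q = P + P^2 + ... for a nilpotent P (finite loop-free chain)."""
    n = len(P)
    Q = np.zeros_like(P)
    power = np.eye(n)
    for _ in range(n):  # P^k = 0 for k >= n when there are no loops
        power = power @ P
        Q += power
    return Q


def order_points(points, Q):
    """Order `points` so that Q[x_i, x_j] == 0 whenever i > j."""
    remaining = list(points)
    ordered = []
    while remaining:
        # A point is minimal if no other remaining point can reach it.
        minimal = [y for y in remaining
                   if all(Q[x, y] == 0 for x in remaining if x != y)]
        if not minimal:
            raise ValueError("the chain has a loop: no minimal element found")
        ordered.extend(minimal)
        remaining = [x for x in remaining if x not in minimal]
    return ordered


# Example: transitions 2 -> 0 -> 3 and 1 -> 3 (row sums below 1 model exits).
P = np.zeros((4, 4))
P[2, 0], P[0, 3], P[1, 3] = 0.5, 0.7, 0.4
Q = hitting_matrix(P)
print(order_points([3, 0, 2, 1], Q))  # → [2, 1, 0, 3]
```

Summing powers of $P$ rather than inverting $1 - P$ keeps the zero entries of $Q$ exactly zero in floating point, which the comparison `Q[x, y] == 0` relies on.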

Once the points are ordered as described, using the Markov property we immediately compute

\[
\rho_n(x_1, \dots, x_n) = \bigl(\pi_{x_1} + (\pi Q)_{x_1}\bigr)\, Q_{x_1 x_2} \cdots Q_{x_{n-1} x_n}.
\]

Lemma 1.2 (used with $b_i = 1$) shows that this is exactly $\det[K_{x_i x_j}]_{i,j=1}^{n}$. □

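Denote by $\mathfrak{Y}$ a finite subset of $\mathfrak{X}$ such that if a trajectory of our Markov chain leaves $\mathfrak{Y}$ then it does not
return to $\mathfrak{Y}$ almost surely. In other words, if $Q_{yz} > 0$ for some $y \in \mathfrak{Y}$ and $z \in \mathfrak{X} \setminus \mathfrak{Y}$, then $Q_{zy'} = 0$ for any
$y' \in \mathfrak{Y}$.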

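In order to consider the behavior of our Markov chain restricted to $\mathfrak{Y}$, it is convenient to contract the
complement $\mathfrak{X} \setminus \mathfrak{Y}$ into one "final" state $\mathcal{F}$. Then the transition matrix for the new Markov chain on $\mathfrak{Y} \cup \{\mathcal{F}\}$
coincides with $P$ on $\mathfrak{Y} \times \mathfrak{Y}$, and for any $y \in \mathfrak{Y}$

\[
P_{y\mathcal{F}} = 1 - \sum_{y' \in \mathfrak{Y}} P_{yy'} = \sum_{z \in \mathfrak{X} \setminus \mathfrak{Y}} P_{yz}, \qquad P_{\mathcal{F}y} = 0, \qquad P_{\mathcal{F}\mathcal{F}} = 1.
\]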

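Set

\[
\pi_y(\mathfrak{Y}) = \mathrm{Prob}\{\text{Intersection of the random trajectory with } \mathfrak{Y} \text{ starts at } y\},
\]
\[
\pi_0(\mathfrak{Y}) = \mathrm{Prob}\{\text{The random trajectory does not intersect } \mathfrak{Y}\}.
\]

Thus, $\pi_0(\mathfrak{Y}) = 1 - \sum_{y \in \mathfrak{Y}} \pi_y(\mathfrak{Y})$. In what follows, we will use the notation $\pi_y$ and $\pi_0$ instead of $\pi_y(\mathfrak{Y})$ and
$\pi_0(\mathfrak{Y})$ for the sake of brevity.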

Let $K_{\mathfrak{Y}}$ denote the restriction of the matrix $K$ of Theorem 1.1 to $\mathfrak{Y} \times \mathfrak{Y}$.
Theorem 1.3. In the above notation, we have $\det(1 - K_{\mathfrak{Y}}) = \pi_0$. If $\pi_0 \ne 0$ then the matrix
$L_{\mathfrak{Y}} := K_{\mathfrak{Y}}(1 - K_{\mathfrak{Y}})^{-1}$ has the form

\[
(L_{\mathfrak{Y}})_{xy} = \pi_0^{-1}\, \pi_x P_{y\mathcal{F}} - P_{yx}, \qquad x, y \in \mathfrak{Y}.
\]

Proof. The first statement is a well-known corollary of Theorem 1.1 which can also be easily proved using 
the inclusion-exclusion principle. 

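Assume that $\pi_0 \ne 0$. For any subset $Y$ of $\mathfrak{Y}$ choose the ordering $(y_1, \dots, y_m)$ of its points in the same way
as we did in the proof of Theorem 1.1: $Q_{y_i y_j} = 0$ for $i > j$. Then Lemma 1.2 yields

\[
\det\bigl(L_{\mathfrak{Y}}|_{Y \times Y}\bigr) = \pi_0^{-1} \cdot \pi_{y_1} P_{y_1 y_2} \cdots P_{y_{m-1} y_m} P_{y_m \mathcal{F}}
= \pi_0^{-1}\, \mathrm{Prob}\{\text{Intersection of the random trajectory with } \mathfrak{Y} \text{ is } Y\}.
\]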



Hence,

\[
\det(1 + L_{\mathfrak{Y}}) = \sum_{Y \subset \mathfrak{Y}} \det\bigl(L_{\mathfrak{Y}}|_{Y \times Y}\bigr) = \pi_0^{-1},
\]

and the matrix $(1 + L_{\mathfrak{Y}})$ is invertible.
Using the identity 

where by $P_{\mathfrak{Y}}$ we mean the restriction of $P$ to $\mathfrak{Y} \times \mathfrak{Y}$, and noting the fact that $P_{\mathfrak{Y}}(1 - P_{\mathfrak{Y}})^{-1}$ gives the
restriction $Q_{\mathfrak{Y}}$ of $Q$ to $\mathfrak{Y} \times \mathfrak{Y}$ (here we use the assumption that the trajectories that leave $\mathfrak{Y}$ do not come
back), we reduce the statement of the theorem to the following two relations:



E ( TZTu) , = "* + E (irzj , p ^ = *°- 



The two sides of the first relation represent two ways of computing the probability that $y \in \mathfrak{Y}$ lies on
the random trajectory of our Markov chain, and the second relation is easily verified once both sides are
multiplied by $(1 + L_{\mathfrak{Y}})$. □
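Both claims of Theorem 1.3 are easy to test numerically. The sketch below is an illustration under stated assumptions, not the author's construction: the window $\mathfrak{Y} = \{0, 1, 2, 3\}$ carries a strictly upper triangular substochastic matrix $P$, the chain either starts inside the window with distribution `pi` or never enters it (so that $\pi_y(\mathfrak{Y}) = \pi_y$ and $\pi_0 = 1 - \sum_y \pi_y$), and `leave` plays the role of $P_{y\mathcal{F}}$. The function name `theorem_1_3_gaps` is hypothetical.

```python
import numpy as np


def theorem_1_3_gaps(pi, P):
    """Return (|det(1 - K) - pi_0|, max |L_formula - K (1 - K)^{-1}|)."""
    n = len(pi)
    identity = np.eye(n)
    Q = np.linalg.inv(identity - P) - identity   # Q = P + P^2 + ...
    K = (pi + pi @ Q)[:, None] - Q.T             # kernel of Theorem 1.1
    pi0 = 1.0 - pi.sum()                         # trajectory misses the window
    leave = 1.0 - P.sum(axis=1)                  # P_{yF}: jump to the final state
    # (L)_{xy} = pi_x * P_{yF} / pi_0 - P_{yx}
    L_formula = np.outer(pi, leave) / pi0 - P.T
    L_direct = K @ np.linalg.inv(identity - K)
    det_gap = abs(np.linalg.det(identity - K) - pi0)
    return det_gap, np.abs(L_formula - L_direct).max()


# Illustrative data: four states, every forward jump has probability 0.2.
P = np.triu(np.full((4, 4), 0.2), k=1)
pi = np.full(4, 0.2)
print(theorem_1_3_gaps(pi, P))  # both gaps should be at rounding-error level
```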

2. Random point processes with Bernoulli noise 

Let $\mathfrak{X}$ be a discrete set and let $\mathcal{P}$ be a random point process on $\mathfrak{X}$ (that is, $\mathcal{P}$ is a probability measure on
$2^{\mathfrak{X}}$). Given two sequences of numbers

\[
0 \le p_x \le 1, \qquad 0 \le q_x \le 1, \qquad x \in \mathfrak{X},
\]

we define a new random point process $\mathcal{P}^{(p,q)}$ on $\mathfrak{X}$ as follows. The random point configuration $X \subset \mathfrak{X}$ is
transformed by the following rules:

• Each particle $x \in X$ is removed with probability $p_x$, and it remains in place with probability $1 - p_x$;

• At each empty location $y \in \mathfrak{X} \setminus X$ a new particle appears with probability $q_y$, and the location remains
empty with probability $1 - q_y$;

• The above operations at different locations are independent.

We say that $\mathcal{P}^{(p,q)}$ is the process $\mathcal{P}$ with Bernoulli noise.

Note that replacing the random point configuration $X$ with its complement on the whole space $\mathfrak{X}$ or on
its part (the so-called particle-hole involution) can be viewed as a special case of the Bernoulli noise.
We will use the notation $D = \mathrm{diag}(d_x)$ for the diagonal matrix with matrix elements

\[
D_{xy} = \begin{cases} d_x, & x = y, \\ 0, & x \ne y, \end{cases} \qquad x, y \in \mathfrak{X}.
\]

Theorem 2.1. For a determinantal point process $\mathcal{P}$ on $\mathfrak{X}$ with correlation kernel $K$, the process with
Bernoulli noise $\mathcal{P}^{(p,q)}$ is also determinantal, and its correlation kernel is given by

\[
K^{(p,q)} = \mathrm{diag}(q_x) + \mathrm{diag}(1 - p_x - q_x) \cdot K.
\]

Comments. 1. The theorem implies that if $p_x + q_x \equiv 1$ then the noisy process $\mathcal{P}^{(p,q)}$ is independent of the
initial process $\mathcal{P}$, and it coincides with the product of Bernoulli random variables located at the points $x \in \mathfrak{X}$
with probability of finding a particle at $x$ equal to $q_x$. This fact is easy to see independently. Indeed, $1 - p_x$
and $q_x$ are the conditional probabilities of finding a particle at $x$ given that the initial process has or does not
have a particle there, and $p_x + q_x = 1$ implies that these probabilities are equal.

2. The particle-hole involution on $\mathfrak{X}$ or its part corresponds to taking both $p_x$ and $q_x$ equal to 1 on the
corresponding part of the space.

3. The class of random point processes obtained as measures on trajectories of loop-free Markov chains is 
not stable under the application of Bernoulli noise. 

4. Theorem 2.1 can be viewed as a special case of [4], Theorem 

Proof of Theorem 2.1. We will consider the case when $p_x$ and $q_x$ are not equal to zero for just one
$x = x_0 \in \mathfrak{X}$. The general case is clearly obtained by composition of such transformations.
From the definition of Bernoulli noise, we obtain

\[
\mathcal{P}^{(p,q)}(X) = (1 - p_{x_0})\, \mathcal{P}(X) + q_{x_0}\, \mathcal{P}(X \setminus \{x_0\}) \quad \text{if } x_0 \in X.
\]

Summing over all $X \subset \mathfrak{X}$ that contain a fixed finite set $Y$ (which contains $x_0$) we obtain

\[
\rho^{(p,q)}(Y) = (1 - p_{x_0}) \cdot \rho(Y) + q_{x_0} \cdot \bigl(\rho(Y \setminus \{x_0\}) - \rho(Y)\bigr) = (1 - p_{x_0} - q_{x_0}) \cdot \rho(Y) + q_{x_0} \cdot \rho(Y \setminus \{x_0\})
\]

with

\[
\rho(Y) = \mathcal{P}\{X \in 2^{\mathfrak{X}} \mid Y \subset X\}, \qquad \rho^{(p,q)}(Y) = \mathcal{P}^{(p,q)}\{X \in 2^{\mathfrak{X}} \mid Y \subset X\}.
\]

One readily sees that the above relation is exactly the relation between the symmetric minors of the matrices
$K^{(p,q)}$ and $K$ if the corresponding sets of rows and columns contain those marked by $x_0$. On the other hand,
the correlation functions $\rho^{(p,q)}$ and $\rho$ of $\mathcal{P}^{(p,q)}$ and $\mathcal{P}$ away from $x_0$ are clearly identical, and the same is true
of the symmetric minors of $K^{(p,q)}$ and $K$ not containing the row and the column marked by $x_0$. □
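As a sanity check of Theorem 2.1, one can apply the noise to a small determinantal process by brute force and compare the resulting correlation functions with the minors of $\mathrm{diag}(q_x) + \mathrm{diag}(1 - p_x - q_x) \cdot K$. The sketch below makes several illustrative assumptions: the process lives on four sites, its kernel comes from the formula of Theorem 1.1 for a strictly upper triangular substochastic chain, and the law of the process is recovered from the kernel by inclusion-exclusion. All function names are hypothetical.

```python
import itertools
import numpy as np


def subsets(n):
    return [frozenset(S) for r in range(n + 1)
            for S in itertools.combinations(range(n), r)]


def minor(K, Y):
    """Symmetric minor of K on the index set Y (1 for the empty set)."""
    idx = sorted(Y)
    return np.linalg.det(K[np.ix_(idx, idx)]) if idx else 1.0


def law_from_kernel(K):
    """Prob{X = S} by inclusion-exclusion over the correlation functions."""
    all_sets = subsets(K.shape[0])
    return {S: sum((-1) ** len(T - S) * minor(K, T) for T in all_sets if S <= T)
            for S in all_sets}


def apply_noise(law, p, q):
    """Remove each particle w.p. p_x, create one at each hole w.p. q_x, independently."""
    n = len(p)
    noisy = dict.fromkeys(law, 0.0)
    for X, weight in law.items():
        for X_new in noisy:
            t = 1.0
            for x in range(n):
                if x in X:
                    t *= (1 - p[x]) if x in X_new else p[x]
                else:
                    t *= q[x] if x in X_new else (1 - q[x])
            noisy[X_new] += weight * t
    return noisy


def noise_discrepancy(K, p, q):
    noisy = apply_noise(law_from_kernel(K), p, q)
    K_noisy = np.diag(q) + np.diag(1 - p - q) @ K
    return max(abs(sum(w for X, w in noisy.items() if Y <= X) - minor(K_noisy, Y))
               for Y in noisy if Y)


# Illustrative kernel from a four-state window chain (assumed data).
P = np.triu(np.full((4, 4), 0.2), k=1)
pi = np.full(4, 0.2)
Q = np.linalg.inv(np.eye(4) - P) - np.eye(4)
K = (pi + pi @ Q)[:, None] - Q.T
p = np.array([0.1, 0.5, 0.0, 1.0])   # site 2: no noise; site 3: particle-hole flip
q = np.array([0.3, 0.2, 0.0, 1.0])
print(noise_discrepancy(K, p, q))    # expected to be at rounding-error level
```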

3. CLT for number of points in a window of a determinantal process 

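Let $\mathfrak{X}$ be a discrete set and $\mathcal{P}$ be a determinantal process on $\mathfrak{X}$ with correlation kernel $K$. For any finite
subset $\mathfrak{Y}$ of $\mathfrak{X}$, let us denote by $N_{\mathfrak{Y}}$ the number of points of the random point configuration that lie in $\mathfrak{Y}$.
Using the definition of the correlation kernel, it is straightforward to verify that

\[
\mathbf{E} N_{\mathfrak{Y}} = \mathrm{Tr}\, K_{\mathfrak{Y}}, \qquad \mathrm{Var}\, N_{\mathfrak{Y}} := \mathbf{E}(N_{\mathfrak{Y}} - \mathbf{E} N_{\mathfrak{Y}})^2 = \mathrm{Tr}(K_{\mathfrak{Y}} - K_{\mathfrak{Y}}^2),
\]

where $\mathbf{E}$ denotes the expectation with respect to $\mathcal{P}$ and $K_{\mathfrak{Y}}$ is the restriction of $K$ to $\mathfrak{Y} \times \mathfrak{Y}$.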
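The trace formulas for the mean and the variance can be compared with a direct computation. In the sketch below (an illustration, with the hypothetical name `window_statistics`), the process is the trajectory of a window chain with strictly upper triangular substochastic $P$, as an assumed stand-in for a loop-free chain observed in a finite set; the moments of the number of visited states are computed by enumerating trajectories and then compared with $\mathrm{Tr}\, K$ and $\mathrm{Tr}(K - K^2)$.

```python
import itertools
import numpy as np


def window_statistics(pi, P):
    """Return ((mean, variance) by enumeration, (Tr K, Tr(K - K^2)))."""
    n = len(pi)
    identity = np.eye(n)
    Q = np.linalg.inv(identity - P) - identity
    K = (pi + pi @ Q)[:, None] - Q.T       # kernel of Theorem 1.1
    leave = 1.0 - P.sum(axis=1)            # probability of exiting the window
    mean = second_moment = 0.0
    for r in range(1, n + 1):
        for S in itertools.combinations(range(n), r):
            # Probability that the visited set is exactly S = (s_1 < ... < s_r).
            prob = pi[S[0]] * leave[S[-1]]
            for a, b in zip(S, S[1:]):
                prob *= P[a, b]
            mean += r * prob
            second_moment += r * r * prob
    enumerated = (mean, second_moment - mean ** 2)
    from_kernel = (np.trace(K), np.trace(K - K @ K))
    return enumerated, from_kernel


P = np.triu(np.full((5, 5), 0.15), k=1)
pi = np.full(5, 0.15)
print(window_statistics(pi, P))  # the two pairs should agree up to rounding
```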

The following statement can be essentially extracted from [5, 13] and [14]. An important difference, though,
is that here we consider correlation kernels which are not necessarily self-adjoint.
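Theorem 3.1. Assume that the correlation kernel $K$ defines a bounded operator in $\ell^2(\mathfrak{X})$. Then for any
sequence $\{\mathfrak{Y}_m\}$ of finite subsets of $\mathfrak{X}$ such that

\[
\lim_{m \to \infty} |\mathfrak{Y}_m| = \infty, \qquad \mathrm{Var}\, N_{\mathfrak{Y}_m} \ge c_1 \cdot |\mathfrak{Y}_m|^{c_2}, \quad m \ge 1,
\]

for some strictly positive constants $c_1$ and $c_2$, the following central limit theorem for $N_{\mathfrak{Y}_m}$ holds:

\[
\lim_{m \to \infty} \mathrm{Prob}\left\{ \frac{N_{\mathfrak{Y}_m} - \mathbf{E} N_{\mathfrak{Y}_m}}{\sqrt{\mathrm{Var}\, N_{\mathfrak{Y}_m}}} \le x \right\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt
\]

for any $x \in \mathbb{R}$.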

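Proof. As is shown in [13], Lemma 1, for $l \ge 2$ the $l$th cumulant of the random variable $N_{\mathfrak{Y}}$ is a linear
combination of expressions of the form $\mathrm{Tr}(K_{\mathfrak{Y}} - K_{\mathfrak{Y}}^m)$ with $m$ ranging from 2 to $l$. Using the idea from [13],
Lemma 2, we obtain

\[
\bigl|\mathrm{Tr}(K_{\mathfrak{Y}} - K_{\mathfrak{Y}}^m)\bigr| \le \sum_{k=1}^{m-1} \bigl|\mathrm{Tr}(K_{\mathfrak{Y}}^k - K_{\mathfrak{Y}}^{k+1})\bigr| \le \sum_{k=1}^{m-1} \|K_{\mathfrak{Y}}^k - K_{\mathfrak{Y}}^{k+1}\|_1
\le \|K_{\mathfrak{Y}} - K_{\mathfrak{Y}}^2\|_1 \sum_{k=1}^{m-1} \|K_{\mathfrak{Y}}\|^{k-1} \le \|K_{\mathfrak{Y}} - K_{\mathfrak{Y}}^2\|_1 \sum_{k=1}^{m-1} \|K\|^{k-1},
\]

where we used the inequalities $|\mathrm{Tr}\, A| \le \|A\|_1$ and $\|AB\|_1 \le \|A\| \cdot \|B\|_1$. Since the trace norm of a matrix does
not exceed the sum of the absolute values of its entries (and matrix elements of an operator are bounded
by its norm), we see that

\[
\|K_{\mathfrak{Y}} - K_{\mathfrak{Y}}^2\|_1 \le (\|K\| + \|K\|^2)\, |\mathfrak{Y}|^2.
\]

Thus, the absolute value of any cumulant of $N_{\mathfrak{Y}_m}$ starting from the second one grows not faster than a fixed
constant times $|\mathfrak{Y}_m|^2$. Hence, the absolute value of the $l$th cumulant of the normalized random variable
$(N_{\mathfrak{Y}_m} - \mathbf{E} N_{\mathfrak{Y}_m}) / \sqrt{\mathrm{Var}\, N_{\mathfrak{Y}_m}}$ is bounded by a constant times

\[
\frac{|\mathfrak{Y}_m|^2}{(\mathrm{Var}\, N_{\mathfrak{Y}_m})^{l/2}} \le c_1^{-l/2} \cdot |\mathfrak{Y}_m|^{2 - c_2 l/2}.
\]

In particular, for large enough $l$ it converges to zero as $|\mathfrak{Y}_m| \to \infty$. [14], Lemma 3, completes the
proof. □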



4. (Delayed) Renewal processes 

In this section, we will apply our previous results to one of the simplest examples of loop-free Markov chains,
namely renewal processes. For the sake of simplicity, we will consider only the discrete case. A large portion of our
statements can be easily carried over to the continuous setting.

Let $\xi_0, \xi_1, \xi_2, \dots$ be a sequence of mutually independent random variables with values in $\mathbb{Z}_{>0}$, such that
$\xi_1, \xi_2, \dots$ (but not $\xi_0$) have a common distribution. By definition, the sequence

\[
S_0 = \xi_0, \quad S_1 = \xi_0 + \xi_1, \quad S_2 = \xi_0 + \xi_1 + \xi_2, \quad \dots
\]

of partial sums forms a delayed renewal process, see, e.g., [6], Volume II, Section VI.6. The word "delayed"
refers to a differently distributed $\xi_0$.

Clearly, $\{S_i\}_{i=0}^{\infty}$ is a trajectory of the loop-free Markov chain on $\mathbb{Z}_{>0}$ with the initial distribution and
transition probabilities given by

\[
\pi_i = \mathrm{Prob}\{\xi_0 = i\}, \qquad P_{ij} = \mathrm{Prob}\{\xi_1 = j - i\}, \qquad i, j \in \mathbb{Z}_{>0}.
\]
Thus, by Theorem 1.1, the sequence $\{S_i\}_{i=0}^{\infty}$ forms a determinantal point process on $\mathbb{Z}_{>0}$.

Let us now restrict our attention to the case when the distributions of the random variables $\xi_j$ have
exponentially decaying tails. It is very plausible that the results proved below can be extended to more general
classes of distributions.

Definition. We say that a random variable $\xi$ with values in $\mathbb{Z}_{>0}$ belongs to class $\mathcal{E}$ if there exists a constant
$r \in (0, 1)$ such that $\mathrm{Prob}\{\xi = n\} \le r^n$ for large enough $n$.

Definition. An integral valued random variable $\xi$ is called aperiodic if

\[
\mathrm{Prob}\{\xi \in N\mathbb{Z}\} < 1 \quad \text{for any } N \ge 2,
\]

that is, the distribution of $\xi$ is not fully supported by $N\mathbb{Z}$.

The main result of this section is the following statement. 

Theorem 4.1. The delayed renewal process with aperiodic $\xi_1$ and $\xi_0, \xi_1 \in \mathcal{E}$, and with arbitrary Bernoulli
noise, defines a determinantal point process on $\mathbb{Z}_{>0}$ whose correlation kernel may be chosen so that it
represents a bounded operator in $\ell^2(\mathbb{Z}_{>0})$.

The proof is based on the following lemma. 

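Lemma 4.2. Consider a random variable $\xi$ with values in $\mathbb{Z}_{>0}$ which belongs to $\mathcal{E}$ and is aperiodic. Set

\[
f_n = \sum_{m=1}^{\infty} \mathrm{Prob}\{\xi^{(1)} + \cdots + \xi^{(m)} = n\}, \qquad f(z) = \sum_{n=1}^{\infty} f_n z^n,
\]

where $\xi^{(1)}, \xi^{(2)}, \dots$ are independent copies of $\xi$. Then there exist positive constants $c_1$ and $c_2$ such that

\[
|f_n - (\mathbf{E}\xi)^{-1}| \le c_1 e^{-c_2 n} \quad \text{for all } n \ge 1.
\]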

Comments. The limit relation $\lim_{n \to \infty} f_n = (\mathbf{E}\xi)^{-1}$ holds without the assumption that $\xi \in \mathcal{E}$ and is
commonly called the "Renewal theorem," see, e.g., [6], Volume I, Chapter XIII.

Proof of Lemma 4.2. Denote $g(z) = \mathbf{E} z^{\xi}$. We have $|g(z)| < 1$ for $|z| < 1$, hence for $|z| < 1$

\[
f(z) = \sum_{m=1}^{\infty} g^m(z) = \frac{g(z)}{1 - g(z)}.
\]

Using the notation $g_1 = g'(1) = \mathbf{E}\xi$, we obtain

\[
f(z) = \frac{1}{g_1(1 - z)} + \frac{(1 + g_1(1 - z))\, g(z) - 1}{g_1 (1 - z)(1 - g(z))}, \qquad |z| < 1.
\]

Since $\xi \in \mathcal{E}$, the function $g(z)$ is holomorphic in a disc of radius greater than 1; in particular, it is holomorphic
in a neighborhood of 1. Hence, using the notation $g_2 = \tfrac{1}{2} g''(1) = \tfrac{1}{2}\mathbf{E}(\xi(\xi - 1))$, we obtain

\[
\frac{(1 + g_1(1 - z))\, g(z) - 1}{g_1(1 - z)(1 - g(z))}
= \frac{(1 + g_1(1 - z))\bigl(1 + g_1(z - 1) + g_2 (z - 1)^2 + O((z - 1)^3)\bigr) - 1}{g_1 (1 - z)^2 \bigl(g_1 + O(z - 1)\bigr)}
= -\frac{g_1^2 - g_2 + O(z - 1)}{g_1^2 \bigl(1 + O(z - 1)\bigr)}
\]

as $z \to 1$. Thus, $f(z) - 1/(g_1(1 - z))$ is holomorphic in a neighborhood of $z = 1$. In addition to that, the fact
that $\xi$ is aperiodic implies that $g(z)$ is not equal to 1 at any point $z \ne 1$ of the unit circle. Therefore, the
function $f(z) - 1/(g_1(1 - z))$ can be analytically continued to a disc of radius greater than 1. This implies
the needed estimate. □
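The exponential convergence in Lemma 4.2 is easy to observe numerically. The sketch below is an illustration under assumptions chosen for convenience: the step distribution $\mathrm{Prob}\{\xi = 1\} = 0.5$, $\mathrm{Prob}\{\xi = 2\} = 0.3$, $\mathrm{Prob}\{\xi = 3\} = 0.2$ is a made-up example (aperiodic and of bounded support, hence with exponentially decaying tails), and the numbers $f_n$ are computed from the standard renewal recursion $f_n = \sum_{k} \mathrm{Prob}\{\xi = k\}\, f_{n-k}$ with $f_0 = 1$ standing for the empty sum. The function name `renewal_sequence` is hypothetical.

```python
import numpy as np


def renewal_sequence(step_law, n_max):
    """f[n] = sum over m >= 1 of Prob{xi_1 + ... + xi_m = n} for n >= 1.

    step_law[k] = Prob{xi = k}; step_law[0] must be 0.
    f[0] = 1 is a convenient seed for the recursion (the empty sum).
    """
    f = np.zeros(n_max + 1)
    f[0] = 1.0
    for n in range(1, n_max + 1):
        k_max = min(n, len(step_law) - 1)
        f[n] = sum(step_law[k] * f[n - k] for k in range(1, k_max + 1))
    return f


step_law = np.array([0.0, 0.5, 0.3, 0.2])   # assumed example distribution
mean_step = sum(k * step_law[k] for k in range(len(step_law)))  # E xi = 1.7
f = renewal_sequence(step_law, 40)
errors = np.abs(f[1:] - 1.0 / mean_step)
print(errors[[0, 9, 19, 39]])  # gaps |f_n - 1/E xi| at n = 1, 10, 20, 40
```

For this particular example, $1 - g(z) = (1 - z)(1 + 0.5z + 0.2z^2)$, and the remaining zeros have modulus $\sqrt{5}$, so the gaps are expected to shrink roughly like $5^{-n/2}$.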



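Proof of Theorem 4.1. By Theorem 2.1, it suffices to prove the statement without Bernoulli noise.
Lemma 4.2 applied to $\xi = \xi_1$ implies that there exist positive constants $c_1$ and $c_2$ such that

\[
|Q_{ij} - (\mathbf{E}\xi_1)^{-1}| \le c_1 e^{c_2 (i - j)}, \qquad i < j.
\]

(Here we use the same notation as before: $Q = P + P^2 + P^3 + \cdots$.) We know that the $\pi_i$'s decay exponentially
as $i \to \infty$ because $\xi_0 \in \mathcal{E}$. Furthermore,

\[
|(\pi Q)_j - (\mathbf{E}\xi_1)^{-1}| \le \sum_{i=1}^{j-1} \pi_i\, |Q_{ij} - (\mathbf{E}\xi_1)^{-1}| + (\mathbf{E}\xi_1)^{-1} \sum_{i=j}^{\infty} \pi_i
\le c_1 \sum_{i=1}^{j-1} \pi_i\, e^{c_2 (i - j)} + (\mathbf{E}\xi_1)^{-1} \sum_{i=j}^{\infty} \pi_i,
\]

which is easily seen to be exponentially decaying as $j \to \infty$.
Thus, the matrix

\[
K_{ij} = \pi_i + (\pi Q)_i - Q_{ji}, \qquad i, j = 1, 2, \dots,
\]

afforded by Theorem 1.1 has the following properties:

\[
|K_{ij}| \le \mathrm{const}_1, \qquad i \le j,
\]
\[
|K_{ij}| \le \mathrm{const}_2\, e^{-\mathrm{const}_3 (i - j)} + \mathrm{const}_4\, e^{-\mathrm{const}_5\, i}, \qquad i > j,
\]

for certain positive constants. This implies that if we consider the matrix $\widetilde{K}$ with matrix elements
$\widetilde{K}_{ij} = e^{a(i - j)} K_{ij}$ with positive $a < \min\{\mathrm{const}_3, \mathrm{const}_5\}$ then the corresponding operator in $\ell^2(\mathbb{Z}_{>0})$ will
be bounded. On the other hand, the symmetric minors of $\widetilde{K}$ and $K$ are clearly the same. Thus, $\widetilde{K}$ is the
needed correlation kernel. □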
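The last step of the proof uses the fact that multiplying the $(i, j)$ entry by $e^{a(i - j)}$ is a conjugation by the diagonal matrix $\mathrm{diag}(e^{a i})$ and therefore does not change symmetric minors. A minimal numerical sketch of this fact follows; the test matrix is random and purely illustrative, and the names `conjugate` and `max_minor_gap` are hypothetical.

```python
import itertools
import numpy as np


def conjugate(K, a):
    """K~[i, j] = exp(a * (i - j)) * K[i, j], i.e. D K D^{-1} with D = diag(exp(a * i))."""
    idx = np.arange(K.shape[0])
    return np.exp(a * (idx[:, None] - idx[None, :])) * K


def max_minor_gap(K, a):
    """Largest difference between corresponding symmetric minors of K and K~."""
    n = K.shape[0]
    K_tilde = conjugate(K, a)
    gap = 0.0
    for r in range(1, n + 1):
        for Y in itertools.combinations(range(n), r):
            sel = np.ix_(Y, Y)
            gap = max(gap, abs(np.linalg.det(K[sel]) - np.linalg.det(K_tilde[sel])))
    return gap


rng = np.random.default_rng(0)
K = rng.random((5, 5))            # arbitrary, not necessarily a correlation kernel
print(max_minor_gap(K, a=0.3))    # expected to be at rounding-error level
```

Since only the differences $i - j$ enter, it makes no difference that the code indexes rows from 0 while the text indexes them from 1.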



Corollary 4.3. The central limit theorem for the number of particles in a window as described in
Theorem 3.1 holds for the delayed renewal processes with Bernoulli noise which satisfy the assumptions of
Theorem 4.1.

Comments. 1. In the absence of Bernoulli noise, the central limit theorem for $N_{[0,T]}$ is an easy corollary of
the classical central limit theorem for sums of i.i.d. random variables, see, e.g., [6], Volume II, Section XI.5.
However, in the presence of Bernoulli noise the statement does not look as obvious.

2. In concrete examples (including the one mentioned in the previous comment), the asymptotic behavior
of the expectation and variance of $N_{\mathfrak{Y}}$ can often be computed explicitly using the formulas

\[
\mathbf{E} N_{\mathfrak{Y}} = \mathrm{Tr}\, K_{\mathfrak{Y}}, \qquad \mathrm{Var}\, N_{\mathfrak{Y}} = \mathrm{Tr}(K_{\mathfrak{Y}} - K_{\mathfrak{Y}}^2).
\]



5. Markov renewal processes 



Markov renewal processes (or semi-Markov processes) are hybrids of Markov chains and renewal processes
introduced by Lévy and Smith in 1954. An excellent introduction to Markov renewal processes can be found
in [10] and [11].

We will use one of the simplest possible settings. 

Consider a Markov chain on $\mathfrak{X} = S \times \mathbb{Z}_{>0}$ with transition probabilities

\[
P\bigl((s_1, t_1) \to (s_2, t_2)\bigr) = P_{s_1 s_2}(t_2 - t_1),
\]

where for any $s_1, s_2 \in S$ the function $P_{s_1 s_2}(t)$ is supported by $\mathbb{Z}_{>0}$,

\[
\sum_{t=1}^{\infty} P_{s_1 s_2}(t) =: P_{s_1 s_2} \in [0, 1], \qquad s_1, s_2 \in S,
\]

and $[P_{s_1 s_2}]_{s_1, s_2 \in S}$ is the matrix of transition probabilities for a Markov chain on $S$.

The projection of the Markov chain on $\mathfrak{X}$ thus defined to the first coordinate is the Markov chain on $S$
sometimes called the "driving" Markov chain. The Markov chain on $\mathfrak{X}$ is a Markov renewal process. It can
be viewed as the driving Markov chain with a randomly transformed time scale: the passage time from $s_1$ to
$s_2$ is a random variable depending on $s_1$ and $s_2$.

Theorem 5.1. Assume that $S$ is finite and the driving Markov chain on $S$ is irreducible. Assume further
that whenever $P_{ss'} \ne 0$, the passage time between $s$ and $s'$ (which is equal to $t$ with probability $P_{ss'}(t)/P_{ss'}$)
is aperiodic and is in $\mathcal{E}$, and that the initial distribution for the Markov renewal process conditioned on the
first coordinate being fixed (and arbitrary) is also in $\mathcal{E}$.

Then the point process on $\mathfrak{X}$ formed by trajectories of the Markov renewal process, with any Bernoulli
noise, is a determinantal point process, and the correlation kernel may be chosen so that it represents a
bounded operator in $\ell^2(\mathfrak{X})$.

Comments. 1. The assumptions of $S$ being finite, the driving chain being irreducible and passage times
being aperiodic and in $\mathcal{E}$ can probably be relaxed. For example, instead of requiring that all passage times
are aperiodic, one can impose the weaker condition of the first return time from any state to itself being
aperiodic.

2. Theorem 4.1 is a special case of Theorem 5.1 obtained by considering the set $S$ with a single element.

Proof of Theorem 5.1. If we start the Markov renewal process at any state $s$ then the times of its arrivals
to any state $s'$ form a delayed renewal process (see the previous section) with the initial delay $\xi_0$ being the
first passage time from $s$ to $s'$ and the common distribution of $\xi_1, \xi_2, \dots$ being the first return time from $s'$
to $s'$.

Both $\xi_0$ and $\xi_1$ are in $\mathcal{E}$; see Lemma 5.3. Furthermore, the aperiodicity of $\xi_1$ immediately follows from
the assumption of all passage times being aperiodic. Since the initial distribution of our Markov renewal
process conditioned on fixing the first coordinate is also in $\mathcal{E}$, when we start from this initial distribution
the arrival times to any state also form a delayed renewal process with the same renewal time $\xi_1$ but, generally
speaking, a different initial delay time $\xi_0$, which however is still in $\mathcal{E}$.

Thus, we can apply the arguments used in the proof of Theorem 4.1 to produce an exponential bound on
the difference of $Q((s, t_1) \to (s', t_2))$ and a known constant (equal to the inverse of the expectation of the
first return time from $s'$ to $s'$).

The remainder of the proof is just the same as in Theorem 4.1: we conjugate the kernel afforded by
Theorem 1.1 by a suitable diagonal matrix whose nonzero matrix elements form a geometric progression,
and thus get the desired correlation kernel which defines a bounded operator in $\ell^2(\mathfrak{X})$. □

Corollary 5.2. The central limit theorem for the number of particles in a window as described in
Theorem 3.1 holds for the Markov renewal processes with Bernoulli noise which satisfy the assumptions of
Theorem 5.1.

It remains to prove the following lemma. 

Lemma 5.3. Under the assumptions of Theorem 5.1, for any $s, s' \in S$ the first passage time from $s$ to $s'$ is
a random variable of class $\mathcal{E}$.
Proof. For any $s_1, s_2 \in S$ such that $P_{s_1 s_2} \ne 0$ set

\[
f_{s_1 s_2}(z) = \frac{1}{P_{s_1 s_2}} \sum_{t=1}^{\infty} P_{s_1 s_2}(t)\, z^t.
\]

By the hypothesis, all such functions have analytic continuation to a disc of radius larger than some constant
$R > 1$. Denote

\[
M = \max_{|z| = R,\ s_1, s_2 \in S} |f_{s_1 s_2}(z)|.
\]

Observe that $M > 1$ by the maximum principle because $f_{s_1 s_2}(1) = 1$.

The probability of the Markov renewal process started at $s_1$ to walk the path $s_1 \to s_2 \to \cdots \to s_m$ and
spend time $T$ on this path can be estimated as follows:

\[
\sum_{t_1 + \cdots + t_{m-1} = T} P_{s_1 s_2}(t_1) P_{s_2 s_3}(t_2) \cdots P_{s_{m-1} s_m}(t_{m-1})
= P_{s_1 s_2} \cdots P_{s_{m-1} s_m} \cdot \frac{1}{2\pi i} \oint_{|z| = R} \frac{f_{s_1 s_2}(z) \cdots f_{s_{m-1} s_m}(z)\, dz}{z^{T+1}}
\le P_{s_1 s_2} \cdots P_{s_{m-1} s_m} \cdot \frac{M^{m-1}}{R^T}.
\]

Note that $P_{s_1 s_2} \cdots P_{s_{m-1} s_m}$ is exactly the probability of the driving Markov chain started at $s_1$ to walk the
path $s_1 \to \cdots \to s_m$.

Let $p$ be the minimum of all $P_{s_1 s_2} \ne 0$. Since the driving chain is irreducible, for any initial distribution
its probability of hitting any given state in the first $|S|$ steps is at least $p^{|S|}$. Hence, the number of steps in
the first passage of the driving chain from any $s_1 \in S$ to any $s_2 \in S$ is a random variable from $\mathcal{E}$. (Indeed,
the probability of this number of steps being at least $n$ is $\le (1 - p^{|S|})^{[n/|S|]}$.)

Take any $a > 0$ such that $M^a < R$. For a first passage of the Markov renewal process from $s_1$ to $s_2$ that
takes time $T$, either the number of steps of the driving Markov chain is at least $[aT]$ or it is $< [aT]$. The
probability of the former event decays exponentially in $T$ by the previous paragraph, while the probability
of the latter event, by the estimate above, does not exceed $M^{[aT]}/R^T$, which also decays exponentially in $T$
by the choice of $a$. □